Abstract

A 2-matching of a graph G is a spanning subgraph with maximum degree two. The size
of a 2-matching U is the number of edges in U and this is at least n - k(U) where n is the
number of vertices of G and k denotes the number of components. In this paper, we analyze
the performance of a greedy algorithm 2GREEDY for finding a large 2-matching on a random
3-regular graph. We prove that with high probability, the algorithm outputs a 2-matching U
with $k(U) = \tilde{\Theta}(n^{1/5})$.



1 Introduction 

In this paper we analyze the performance of a generalization of the well-known Karp-Sipser
algorithm [13, 12, 1, 4] for finding a large matching in a sparse random graph. A 2-matching U of a
graph G is a spanning subgraph with maximum degree two. Our aim is to show that w.h.p. our
algorithm finds a large 2-matching in a random cubic graph. The algorithm 2GREEDY is described
below and has been partially analyzed on the random graph $G_{n,cn}^{\delta \ge 3}$, $c \ge 10$, in Frieze [9]. The random
graph $G_{n,m}^{\delta \ge 3}$ is chosen uniformly at random from the collection of all graphs that have n vertices,
m edges and minimum degree $\delta(G) \ge 3$. In [9], the 2-matching output by the algorithm is used to
find a Hamilton cycle in $O(n^{1.5+o(1)})$ time w.h.p. Previously, the best known result for this model
was that $G_{n,cn}^{\delta \ge 3}$ is Hamiltonian for $c \ge 64$ due to Bollobás, Cooper, Fenner and Frieze [7]. It is
conjectured that $G_{n,cn}^{\delta \ge 3}$ is Hamiltonian w.h.p. for all $c \ge 3/2$.

The existence of Hamilton cycles in other random graph models with O(n) edges has also been
the subject of much research. In such graphs, the requirement $\delta \ge 3$ is necessary to avoid three
vertices of degree two sharing a common neighbor. This obvious obstruction occurs with positive
probability in many models with O(n) edges and $\delta = 2$. $G_{3\text{-out}}$ is a random graph where each vertex
chooses 3 neighbors uniformly at random. This graph has minimum degree 3 and average degree 6.
Bohman and Frieze proved that $G_{3\text{-out}}$ is Hamiltonian w.h.p., also by building a large 2-matching
into a Hamilton cycle [3]. Robinson and Wormald proved that random r-regular graphs with $r \ge 3$ are
Hamiltonian w.h.p. using an intricate second moment approach [14], [15]. Before this result, Frieze
proved Hamiltonicity of random r-regular graphs w.h.p. for $r \ge 85$ using an algorithmic approach [10]. An
algorithmic proof of Hamiltonicity for $r \ge 3$ was given in [11].

In addition to the Hamiltonicity of $G_{n,cn}^{\delta \ge 3}$ for 3/2 < c < 10, the Hamiltonicity of random graphs
with O(n) edges and a fixed degree sequence is a wide open question. One natural example is the
Hamiltonicity of a graph chosen uniformly at random from the collection of all graphs with n/2
vertices of degree 3 and n/2 vertices of degree 4 (this particular question was posed by Wormald).
For both $G_{n,cn}^{\delta \ge 3}$ and graphs with a fixed degree sequence one might hope to prove Hamiltonicity by
first using 2GREEDY to produce a large 2-matching and then using an extension-rotation argument
to convert this 2-matching into a Hamilton cycle. In this paper we provide evidence that the first
half of this broad program is feasible by showing that 2GREEDY finds a very large 2-matching for
the sparsest of the models with minimum degree 3, the random cubic graph itself.

The size of a 2-matching U is the number of edges in U and this is at least n - k(U) where k
denotes the number of components. It was shown in [12] that w.h.p. the Karp-Sipser algorithm
only leaves $\tilde{O}(n^{1/5})$ vertices unmatched. Here we prove the corresponding result for 2GREEDY on
a random cubic graph.

Theorem 1.1. Algorithm 2GREEDY run on a random 3-regular graph with n vertices outputs a
2-matching U with $k(U) = \tilde{\Theta}(n^{1/5})$, w.h.p.

We prove Theorem 1.1 using the differential equations method for establishing dynamic
concentration. The remainder of the paper is organized as follows. The 2GREEDY algorithm is introduced in
the next section, and the random variables we track are given in Section 3. The trajectories that
we expect these variables to follow are given in Section 4. A heuristic explanation of why 2GREEDY
should produce a 2-matching with roughly $n^{1/5}$ components is also given in Section 4. In Section 5
we state and prove our dynamic concentration result. The proof of Theorem 1.1 is then completed
in Sections 5, 6, and 7.

2 The Algorithm 

The Karp-Sipser algorithm for finding a large matching in a sparse random graph is essentially 
the greedy algorithm, with one slight modification that makes a big difference. While there are 
vertices of degree one in the graph, the algorithm adds to the matching an edge incident with such 
a vertex. Otherwise, the algorithm chooses a random edge to add to the matching. The idea is 
that no mistakes are made while pendant edges are chosen since such edges are always contained 
in some maximum matching. The algorithm presented in [9] is a generalization of Karp-Sipser 
for 2-matchings. Our algorithm is essentially the same as that presented in [9] applied to random 
cubic graphs. A few slight modifications have been made to ease the analysis and to account for 
the change in model. We assume that our input (multi-)graph G = G([n], E) is generated by
the configuration model of Bollobás [6]. Let W = [3n] be our set of configuration points and let
$W_i = [3(i-1)+1, 3i]$, $i \in [n]$, partition W. The function $\phi : W \to [n]$ is defined by $w \in W_{\phi(w)}$.
Given a pairing F (i.e. a partition of W into m = 3n/2 pairs) we obtain a (multi-)graph $G_F$ with
vertex set [n] and an edge $(\phi(u), \phi(v))$ for each $\{u, v\} \in F$. Choosing a pairing F uniformly at
random from among all possible pairings $\Omega$ of the points of W produces a random (multi-)graph
$G_F$. It is known that, conditional on $G_F$ being simple (i.e. having no loops or multiple edges),
it is equally likely to be any (simple) cubic graph. Further, $G_F$ is simple with probability
$(1 - o(1))e^{-2}$. So from now on we work with $G = G_F$.
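
The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: points and vertices are 0-indexed (so the block $W_i$ becomes {3i, 3i+1, 3i+2} and $\phi(w)$ becomes `w // 3`), and the function names `random_pairing`, `multigraph_from_pairing` and `is_simple` are hypothetical names chosen here, not part of the paper.

```python
import random
from collections import Counter

def random_pairing(n, rng):
    """Return a uniformly random pairing of the 3n configuration points 0, ..., 3n-1."""
    if n % 2:
        raise ValueError("3n must be even, so n must be even")
    points = list(range(3 * n))
    rng.shuffle(points)
    # Pairing consecutive entries of a uniform shuffle gives a uniform perfect matching.
    return [(points[k], points[k + 1]) for k in range(0, 3 * n, 2)]

def multigraph_from_pairing(pairing):
    """Project each point w to its vertex phi(w) = w // 3 (0-indexed analogue of W_i)."""
    return [(u // 3, v // 3) for u, v in pairing]

def is_simple(edges):
    """True when the multigraph has no loops and no repeated edges."""
    counts = Counter(tuple(sorted(e)) for e in edges)
    return all(a != b and c == 1 for (a, b), c in counts.items())

rng = random.Random(0)
edges = multigraph_from_pairing(random_pairing(10, rng))
degree = Counter(x for e in edges for x in e)   # a loop contributes 2 to its vertex
print(sorted(degree.values()), is_simple(edges))
```

Every vertex has degree 3 in the resulting multigraph by construction; whether a given sample is simple is random, which is why the analysis works with $G_F$ directly.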

We only reveal adjacencies (pairings) of $G_F$ as the need arises in the algorithm. As the algorithm
progresses, it grows a 2-matching and deletes vertices and edges from the input graph G. We let
$\Gamma = (V_\Gamma, E_\Gamma)$ be the current state of G. Throughout the algorithm we keep track of the following:

• U is the set of edges of the current 2-matching. The internal vertices and edges of the paths
and cycles in U will have been deleted from $\Gamma$.

• b(v) is the 0-1 indicator for vertex $v \in [n]$ being incident to an edge of U.

• $Y_k = \{v \in V_\Gamma : d_\Gamma(v) = k, b(v) = 0\}$, k = 0, 1, 2, 3.

• $Z_k = \{v \in V_\Gamma : d_\Gamma(v) = k, b(v) = 1\}$, k = 0, 1, 2.

We refer to the sets $Y_3$ and $Z_2$ as Y and Z throughout. The basic idea of the algorithm is as
follows. We add edges to the 2-matching one by one, which sometimes forces us to delete edges.
These deletions may put vertices in danger of having degree less than 2 in the final 2-matching.
Thus, we prioritize the edges that we add to U, so as to match the dangerous vertices first. More
precisely, at each iteration of the algorithm, a vertex v is chosen and an adjacent edge is added
to U. We choose v from the first non-empty set in the following list: $Y_1, Y_2, Z_1, Y, Z$. As in the
Karp-Sipser algorithm, taking edges adjacent to the vertices in $Y_1$, $Y_2$ and $Z_1$ is not a mistake. We
will prove that by proceeding in this manner, we do not create too many components.

When a vertex v is chosen and its neighbor in the configuration is exposed, the exposure is called a
selection move. Call the revealed neighbor w the selection. The edge (v, w) is removed from
$\Gamma$ and added to U. If the selection w is a vertex in Z, then once (v, w) is added to U, we must delete
the other edge adjacent to w. Hence we reveal the other edge in the configuration adjacent to w.
Call this exposure a deletion move.

Details of the algorithm are now given. 

Algorithm 2GREEDY:

Initially, all vertices are in Y. Iterate the following steps as long as one of the conditions holds.

Step 1(a) $Y_1 \ne \emptyset$.

Choose a random vertex v of $Y_1$. Suppose its neighbor in $\Gamma$ is w. Remove (v, w) from $\Gamma$ and
add it to U. Set b(v) = 1 and move v to $Z_0$.

re-assign(w).

Step 1(b) $Y_1 = \emptyset$, $Y_2 \ne \emptyset$.

Choose a random vertex v of $Y_2$. Randomly choose one of the two neighbors of v in $\Gamma$ to
expose and call it w.

If w = v ({v} comprises an isolated component in $\Gamma$ with a loop), then remove (v, v) from $\Gamma$
and move v from $Y_2$ to $Y_0$.

Otherwise, remove (v, w) from $\Gamma$ and add it to U. Set b(v) = 1 and move it to $Z_1$.
re-assign(w).

Step 1(c) $Y_1 = Y_2 = \emptyset$, $Z_1 \ne \emptyset$.

Choose a random vertex v of $Z_1$. v is the endpoint of a path in U. Let u be the other endpoint
of this path. Suppose the neighbor of v in $\Gamma$ is w. Remove (v, w) from $\Gamma$ and add it to U.
Remove v from $\Gamma$.

re-assign(w).

Step 2 $Y_1 = Y_2 = Z_1 = \emptyset$, $Y \ne \emptyset$.

Choose a random vertex v of Y. Randomly choose one of the three neighbors of v in $\Gamma$ to
expose and call it w.

If w = v, then we remove (v, v) from $\Gamma$ and move v to $Y_1$.

Otherwise, remove (v, w) from $\Gamma$ and add it to U. Set b(v) = 1 and move it to Z.
re-assign(w).

Step 3 $Y_1 = Y_2 = Z_1 = Y = \emptyset$, $Z \ne \emptyset$.

The remaining graph is a random 2-regular graph on |Z| many vertices. Put a maximum
matching on the remaining graph. Add the edges of this matching to U.

Subroutine re-assign(w):

1. If b(w) = 0:

Set b(w) = 1 and move w from Y to Z, $Y_2$ to $Z_1$ or $Y_1$ to $Z_0$ depending on the initial state of
w.

2. If b(w) = 1:

Remove w from $\Gamma$. If w was in Z prior to removal, then the removal of w from $\Gamma$ causes an
edge (w, w') to be deleted from $\Gamma$. Move w' to the appropriate new set. For example, if w'
were in Z, it would be moved to $Z_1$; if w' were in Y, it would be moved to $Y_2$, etc.
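
The steps and the subroutine above can be sketched as follows. This is a minimal illustration and not the implementation analyzed in the paper: it assumes the whole cubic multigraph is given up front as an edge list on vertices 0..n-1 (instead of exposing the pairing lazily), it recomputes the sets $Y_k$, $Z_k$ by scanning at every step (quadratic time), and the names `two_greedy` and `count_components` are hypothetical.

```python
import random

def two_greedy(n, edges, rng):
    """Sketch of 2GREEDY on a cubic multigraph with vertices 0..n-1; returns the edge list U."""
    adj = {v: [] for v in range(n)}          # Gamma, stored as neighbour multisets
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)                     # a loop (v, v) places v in adj[v] twice
    b = [0] * n                              # b[v] = 1 once v is incident to an edge of U
    alive = set(range(n))                    # vertices not yet removed from Gamma
    U = []

    def remove_edge(u, w):
        adj[u].remove(w)
        adj[w].remove(u)

    def remove_vertex(w):
        while adj[w]:                        # delete any remaining edge at w
            remove_edge(w, adj[w][0])
        alive.discard(w)

    def reassign(w):
        if b[w] == 0:
            b[w] = 1                         # Y -> Z, Y_2 -> Z_1 or Y_1 -> Z_0, read off from the degree
        else:
            remove_vertex(w)                 # w is now internal to a path or cycle of U

    def members(flag, k):
        return [v for v in alive if b[v] == flag and len(adj[v]) == k]

    while True:
        pool = []
        for flag, k in ((0, 1), (0, 2), (1, 1), (0, 3)):   # priority Y_1, Y_2, Z_1, Y
            pool = members(flag, k)
            if pool:
                break
        if not pool:
            break                            # only Z and isolated vertices remain
        v = rng.choice(pool)
        w = rng.choice(adj[v])               # expose one neighbour of v
        remove_edge(v, w)
        if w == v:
            continue                         # a loop was exposed: v drops to Y_1 or Y_0
        U.append((v, w))
        if b[v]:
            remove_vertex(v)                 # Step 1(c): v was the endpoint of a path
        else:
            b[v] = 1
        reassign(w)

    # Step 3: what is left is 2-regular, so walk each cycle and keep alternate edges.
    for s in sorted(alive):
        walk, cur = [], s
        while adj[cur]:
            nxt = adj[cur][0]
            remove_edge(cur, nxt)
            walk.append((cur, nxt))
            cur = nxt
        U.extend(walk[0:len(walk) - 1:2])    # floor(len/2) pairwise disjoint edges
    return U

def count_components(n, U):
    """Number of components of the spanning subgraph ([n], U), via union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, w in U:
        parent[find(u)] = find(w)
    return len({find(v) for v in range(n)})

rng = random.Random(0)
n = 200
points = list(range(3 * n))
rng.shuffle(points)
edges = [(points[k] // 3, points[k + 1] // 3) for k in range(0, 3 * n, 2)]
U = two_greedy(n, edges, rng)
print(len(U), count_components(n, U))
```

In this sketch the class of a vertex is never stored; it is recomputed from b(v) and the current degree, which keeps the code close to the definitions of $Y_k$ and $Z_k$ at the cost of efficiency.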

3 The Variables 

In this section we will describe the variables which are tracked as the algorithm proceeds.
Throughout the paper, in a slight abuse of notation, we let Y, Z, etc. refer to both the sets and the sizes of
the sets. Let M refer to the size of $E_\Gamma$. We also define the variable

$\zeta := Y_1 + 2Y_2 + Z_1.$

If X is a variable indexed by i, we define

$\Delta X(i) := X(i+1) - X(i).$

3.1 The sequences $\sigma$, $\delta$

We define two sequences $\sigma$, $\delta$ indexed by the step number i. $\sigma(i)$ will indicate what type of vertex
is selected during a selection move, and $\delta(i)$ will do the same for deletion moves.

Formally, $\sigma$ is a sequence of the following symbols: Y, Z, $\zeta$, loop, multi. We will put $\sigma(i)$ = loop
only when step i is of type 2 and the selection move reveals a loop. We put $\sigma(i)$ = multi only
when step i is of type 1(c), and $w = u \in Z$. The only way this happens is when $v \in Z_1$, $u \in Z$,
$(v, u) \in U$, and the selection made at step i happens to select the vertex u. Otherwise we just put
$\sigma(i) = Y, Z, \zeta$ according to whether the selected vertex is in Y, Z, $\zeta$ (where a vertex is said to be in $\zeta$ if it lies in $Y_1$, $Y_2$ or $Z_1$).

Note that the symbols loop, multi are for very specific events, and not just any loop or multi-edge.
If step i is of type 1(b) and our selection move reveals a loop, then we put $\sigma(i) = \zeta$. Also, if
step i is of type 1(c) and the selection move reveals a multi-edge whose other endpoint is also in
$Z_1$ then we put $\sigma(i) = \zeta$ as well.

$\delta$ is a sequence of symbols: Y, Z, $\zeta$, $\emptyset$. We will put $\delta(i) = \emptyset$ when there is no deletion move
at step i (i.e. when $\sigma(i) \notin \{Z, multi\}$). Otherwise $\delta(i)$ just indicates the type of vertex that the
deletion move picks (here we don't make any distinctions regarding loops or multi-edges).

3.2 The variables A, B 

We will define the following two important variables: 

$A := Y + \zeta$
$B := 2Y + Z + \zeta.$

A is a natural quantity to define, since the algorithm terminates precisely when A = 0. B is also
natural because it represents the number of half-edges which will (optimistically) be added to our
current 2-matching before termination. We will see that A and B are also nice variables in that
their 1-step changes $\Delta A(i)$, $\Delta B(i)$ do not depend on what type of step we take at step i. We have

$\Delta Y(i) = -1_{\zeta(i)=0} - 1_{\sigma(i)=Y} - (1_{\sigma(i)=Z} + 1_{\sigma(i)=multi}) 1_{\delta(i)=Y}$ (3.1)

$\Delta Z(i) = 1_{\zeta(i)=0} + 1_{\sigma(i)=Y} - 1_{\sigma(i)=Z} - 1_{\sigma(i)=loop} - 1_{\sigma(i)=multi} - (1_{\sigma(i)=Z} + 1_{\sigma(i)=multi}) 1_{\delta(i)=Z}$ (3.2)

$\Delta\zeta(i) = -1_{\zeta(i)>0} + 1_{\sigma(i)=loop} - 1_{\sigma(i)=\zeta} + (1_{\sigma(i)=Z} + 1_{\sigma(i)=multi}) (-1_{\delta(i)=\zeta} + 1_{\delta(i)=Z} + 2 \cdot 1_{\delta(i)=Y})$ (3.3)

and note that these all depend on whether $\zeta = 0$ (i.e. whether step i is of type 1 or 2). However,

$\Delta A(i) = -1 - 1_{\sigma(i)=Y} - 1_{\sigma(i)=\zeta} + 1_{\sigma(i)=loop} + 1_{\sigma(i)=multi} + 1_{\sigma(i)=Z} - (1_{\sigma(i)=Z} + 1_{\sigma(i)=multi}) \cdot 2 \cdot 1_{\delta(i)=\zeta}$ (3.4)

$\Delta B(i) = -2 + 1_{\sigma(i)=loop} - 1_{\delta(i)=\zeta}$ (3.5)

which do not depend on whether $\zeta = 0$. For $\Delta A$, we have used the identity

$1_{\delta=Y} + 1_{\delta=Z} + 1_{\delta=\zeta} = 1_{\sigma=Z} + 1_{\sigma=multi}$

which states that we make a deletion move if and only if our selection move was Z or multi. Note
also that if we establish dynamic concentration on A, B, $\zeta$ then we implicitly establish concentration
on Y, Z, M since

$Y = A - \zeta$ (3.6)
$Z = B - 2A + \zeta$ (3.7)
$2M = 3Y + 2Z + \zeta = 2B - A.$ (3.8)
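
As a check on the last identity, here is the substitution written out. It assumes, as the definitions suggest, that 2M counts half-edges of the current graph, that vertices of Y and Z carry 3 and 2 half-edges respectively, and that $\zeta = Y_1 + 2Y_2 + Z_1$ counts the half-edges at the remaining non-isolated vertices.

```latex
\begin{align*}
2M &= 3Y + 2Z + \zeta \\
   &= 3(A - \zeta) + 2(B - 2A + \zeta) + \zeta && \text{by (3.6) and (3.7)} \\
   &= 2B - A .
\end{align*}
```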



4 The expected behavior of A, B, $\zeta$

In this section, we will non-rigorously predict the behavior of the variables and some facts about
the process. Throughout the paper, unless otherwise specified, t refers to the scaled version of i, so

$t = i/n.$

Heuristically, we assume there exist differentiable functions a, b such that $A(i) \approx na(t)$, $B(i) \approx nb(t)$.
Further, we assume that $\zeta$ stays "small". We will prove that these assumptions are indeed valid.
We also let

$p_z := \frac{2Z}{2M}, \quad p_y := \frac{3Y}{2M}, \quad p_\zeta := \frac{\zeta}{2M}$

where we have omitted the dependence on i for ease of notation.

4.1 The trajectory b(t) 

Since B(0) = 2n, and recalling (3.5), we see that

$B(i) = 2n - 2i + \sum_{j<i} (1_{\sigma(j)=loop} - 1_{\delta(j)=\zeta}).$ (4.1)

The probability that $\sigma(j)$ = loop or $\delta(j) = \zeta$ on any step j should be negligible. Thus we expect

$B(i) \approx 2n - 2i = 2n(1 - t)$

so we will set

$b(t) = 2(1 - t).$

4.2 The trajectory a(t) 

We derive an ODE that a should satisfy:

$a'(t) \approx E[\Delta A(i)] \approx -1 - p_y + p_z \approx -\frac{6a(t)}{2b(t) - a(t)}.$

Note that we have used (3.4). Thus a(t) should satisfy

$a' = -\frac{6a}{4 - 4t - a}.$ (4.2)

The substitution $x = a/(1-t)$ yields a separable ODE, which can be integrated to arrive at

$(a + 2 - 2t)^3 - 27a^2 = 0.$

So a(t) is given implicitly as the solution to the above cubic equation with coefficients depending
on t. We may actually solve that cubic to get three continuous explicit functions $a_1(t), a_2(t), a_3(t)$
(though the formulas are nasty to look at). From the initial condition a(0) = 1 and the fact that
$0 \le a(t) \le 1$, it's clear that the solution we want is

$a(t) = 7 + 2t - 6\sqrt{5 + 4t} \cos\left( \frac{1}{3} \arccos\left( \frac{11 + 14t + 2t^2}{(5 + 4t)^{3/2}} \right) + \frac{\pi}{3} \right).$
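
The closed form can be checked numerically. The sketch below evaluates $a(t) = 7 + 2t - 6\sqrt{5+4t}\cos(\frac{1}{3}\arccos(\frac{11+14t+2t^2}{(5+4t)^{3/2}}) + \frac{\pi}{3})$ and measures how well it satisfies the cubic $(a + 2 - 2t)^3 = 27a^2$ and the ODE $a' = -6a/(4 - 4t - a)$; the function names are illustrative only, and the derivative is approximated by a central difference.

```python
import math

def a_closed_form(t):
    """Evaluate the closed-form trajectory a(t) for 0 <= t <= 1."""
    ratio = (11 + 14 * t + 2 * t * t) / (5 + 4 * t) ** 1.5
    angle = math.acos(min(1.0, ratio)) / 3 + math.pi / 3   # clamp guards against rounding above 1
    return 7 + 2 * t - 6 * math.sqrt(5 + 4 * t) * math.cos(angle)

def cubic_residual(t):
    """Value of (a + 2 - 2t)^3 - 27 a^2 at a = a(t); should be close to 0."""
    a = a_closed_form(t)
    return (a + 2 - 2 * t) ** 3 - 27 * a * a

def ode_residual(t, h=1e-5):
    """Central-difference estimate of a'(t) plus 6a / (4 - 4t - a); should be close to 0."""
    a = a_closed_form(t)
    derivative = (a_closed_form(t + h) - a_closed_form(t - h)) / (2 * h)
    return derivative + 6 * a / (4 - 4 * t - a)

for t in (0.0, 0.25, 0.5, 0.75):
    print(t, a_closed_form(t), cubic_residual(t))
```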

From here we can see that $a(t) \to 0$ as $t \to 1^-$. More precisely,

$\lim_{t \to 1^-} \frac{a(t)}{(1-t)^{3/2}} = \left(\frac{2}{3}\right)^{3/2}.$ (4.3)

To confirm this, note that

$\arccos(1 - \epsilon) = \sqrt{2\epsilon} + O(\epsilon^{3/2})$ (4.4)

and

$\frac{11 + 14(1-\epsilon) + 2(1-\epsilon)^2}{(5 + 4(1-\epsilon))^{3/2}} = 1 - \frac{4\epsilon^3}{729} + O(\epsilon^4).$ (4.5)

Rewriting the cos term using the angle addition formula and Taylor expansion, we see (4.3).
Additionally,

$\frac{d}{dt}\left[\frac{a(t)}{(1-t)^{3/2}}\right] < 0.$ (4.6)

Since a(0) = 1, for all $0 \le t < 1$ we have

$\left(\frac{2}{3}\right)^{3/2} (1-t)^{3/2} \le a(t) \le (1-t)^{3/2}.$ (4.7)
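
The bounds on $a(t)/(1-t)^{3/2}$ can also be read off directly from the cubic $(a + 2 - 2t)^3 = 27a^2$, without the trigonometric formula. The following short derivation is offered as a cross-check; the auxiliary variable $x = a/(1-t)$ is introduced here for convenience.

```latex
\begin{align*}
x := \frac{a}{1-t}
  &\;\Longrightarrow\; (x+2)^3 (1-t)^3 = 27 x^2 (1-t)^2
  \;\Longrightarrow\; (x+2)^3 (1-t) = 27 x^2, \\
\frac{a^2}{(1-t)^3} &= \frac{x^2}{1-t} = \frac{(x+2)^3}{27}.
\end{align*}
% At t = 0 we have x = 1, so the ratio equals 1. As t -> 1^- the relation
% (x+2)^3 (1-t) = 27 x^2 forces x -> 0, so the ratio tends to 8/27,
% i.e. a(t)/(1-t)^{3/2} -> (2/3)^{3/2}. Since the ODE gives
% x'(1-t) = -x(x+2)/(4-x) < 0 for 0 < x < 4, x is decreasing and so is the ratio.
```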



4.3 Downward drift of $\zeta$

We expect $\zeta$ to be "small", and to justify that claim we will show that whenever $\zeta$ is positive, it
is likely to decrease. Assume that $\zeta > 0$. In the following table, we make use of the fact that
$\delta(i) \ne \emptyset$ if and only if $\sigma(i) \in \{Z, multi\}$. So for example,

$1_{\delta=Y} = (1_{\sigma=Z} + 1_{\sigma=multi}) 1_{\delta=Y}.$

Then from (3.3) we see that if $\zeta > 0$,

$\Delta\zeta = 1$ with prob. $p_z p_y + O(1/M)$,
$\Delta\zeta = 0$ with prob. $p_z^2 + O(1/M)$,
$\Delta\zeta = -1$ with prob. $p_y + O(1/M)$,
$\Delta\zeta = -2$ with prob. $p_\zeta + p_z p_\zeta + O(1/M)$. (4.8)

To illuminate this table, we provide an example.

$P[\Delta\zeta(i) = 1] = P[\sigma(i) = Z, \delta(i) = Y] = \frac{2Z}{2M-1} \cdot \frac{3Y}{2M-3} = p_z p_y + O\left(\frac{1}{M}\right).$

Therefore, roughly speaking we have

$E[\Delta\zeta(i)] = p_z p_y - p_y + O(p_\zeta) \approx -\frac{9a^2}{(2b-a)^2}$

and are motivated to define

$\Phi(t) := \frac{9a^2}{(2b-a)^2} = \Theta(1-t)$ (4.9)

to represent the downward drift of $\zeta(i)$ (if it is positive) at step i.
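
The one-step law of $\zeta$ described in this subsection can be tabulated and its mean computed directly. The sketch below drops the $O(1/M)$ corrections and takes the leading-order probabilities as $p_z p_y$, $p_z^2$, $p_y$ and $p_\zeta + p_z p_\zeta$ for the changes 1, 0, -1 and -2; the function names are hypothetical. With $p_\zeta = 0$ the mean reduces to $-p_y^2$, which is the quantity that $\Phi$ models.

```python
def zeta_step_law(p_y, p_z, p_zeta):
    """Leading-order law of the one-step change of zeta when zeta > 0 (O(1/M) terms dropped)."""
    return {1: p_z * p_y, 0: p_z * p_z, -1: p_y, -2: p_zeta + p_z * p_zeta}

def mean_and_variance(law):
    """Mean and variance of a finite law given as {value: probability}."""
    mean = sum(k * p for k, p in law.items())
    second = sum(k * k * p for k, p in law.items())
    return mean, second - mean * mean

law = zeta_step_law(p_y=0.4, p_z=0.6, p_zeta=0.0)
mean, var = mean_and_variance(law)
print(mean, var)   # the mean equals -p_y**2 when p_zeta = 0
```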



4.4 Expected behavior of $\zeta$

In the last subsection we estimated $E[\Delta\zeta(i)]$ when $\zeta > 0$, using (4.8). We can also use (4.8) to
estimate the variance when $\zeta > 0$. We see that

$Var[\Delta\zeta(i) \mid \zeta > 0] = \Theta(p_y) = \Theta\left((1-t)^{1/2}\right).$

Thus, to model the behavior of $\zeta$ we consider a simpler variable: a lazy random walk $X_\tau(k)$ with
$X_\tau(0) = 0$, expected 1-step change $E[\Delta X_\tau] = -(1-\tau)$ and $Var[\Delta X_\tau] = (1-\tau)^{1/2}$. After s steps,
we have $E[X_\tau(s)] = -(1-\tau)s$ and $Var[X_\tau(s)] = (1-\tau)^{1/2}s$. There is at least constant (bounded
away from 0) probability that $X_\tau(s)$ is, say, 1 standard deviation above its mean. However, the
probability that $X_\tau(s)$ is very many standard deviations larger than that is negligible. In other
words, it is reasonable to have a displacement as large as $X_\tau(s) = -(1-\tau)s + (1-\tau)^{1/4}s^{1/2}$, but not
much larger. The quantity $\psi(s) := -(1-\tau)s + (1-\tau)^{1/4}s^{1/2}$ is negative for $s > (1-\tau)^{-3/2}$. Also $\psi(s)$
is maximized when $s = \frac{1}{4}(1-\tau)^{-3/2}$, where we have $\psi(s) = \frac{1}{4}(1-\tau)^{-1/2}$.
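
The two claims about $\psi(s) = -(1-\tau)s + (1-\tau)^{1/4}s^{1/2}$ (its zero at $s = (1-\tau)^{-3/2}$ and its maximum at a quarter of that value) are easy to confirm numerically. The helper names below are illustrative only.

```python
def psi(s, tau):
    """psi(s) = -(1 - tau) s + (1 - tau)^(1/4) s^(1/2): mean plus one standard deviation."""
    return -(1 - tau) * s + (1 - tau) ** 0.25 * s ** 0.5

def psi_zero(tau):
    """Positive root of psi, namely (1 - tau)^(-3/2)."""
    return (1 - tau) ** -1.5

def psi_argmax(tau):
    """Maximiser of psi, namely (1/4) (1 - tau)^(-3/2)."""
    return 0.25 * (1 - tau) ** -1.5

tau = 0.75
print(psi(psi_zero(tau), tau), psi(psi_argmax(tau), tau), 0.25 * (1 - tau) ** -0.5)
```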

Now we reconsider the variable $\zeta$. Roughly speaking, $\zeta(i)$ behaves like the lazy random walk
considered above, so long as we restrict the variable i to a short range (so that t does not change
significantly), and we have $\zeta > 0$ for this range of i. We have $\zeta(0) = 0$, and $\zeta$ has a negative drift
so it's likely that $\zeta(j) = 0$ for many j > 0. Specifically, if j is an index such that $\zeta(j) = 0$, then
we expect $\zeta(i)$ to behave like $X_\tau(i - j)$ with $\tau = j/n$, so long as i is not significantly larger than j.
Thus we expect to have $\zeta(i) = 0$ for some $j < i \le j + (1-\tau)^{-3/2}$. Also, for all $j \le i \le j + (1-\tau)^{-3/2}$
we should have $\zeta(i) \le \frac{1}{4}(1-\tau)^{-1/2}$. But this rough analysis does not make sense toward the end of
the process: indeed, for $j > n - n^{3/5}$ (i.e. for $1 - \tau < n^{-2/5}$), we have $j + (1-\tau)^{-3/2} > n$. However,
we can still say something about what happens when j is large, since the variable s cannot be any
bigger than n - j. Now for $j > n - n^{3/5}$ and $s \le n - j$ we have $\psi(s) \le n^{1/5}$. Thus, we never expect $\zeta$
to be larger than $n^{1/5}$, even towards the end of the process.

4.5 Why do we have $n^{1/5}$ many components?

At any step of the algorithm, we expect the components of the 2-matching to be mostly paths
(and a few cycles). We would like the algorithm to keep making the paths longer, but sometimes it
isn't possible to make a path any longer because of deletion moves. Specifically, for example, if one
endpoint of a path is in $Z_1$, and then there is a deletion move and the deletion is that endpoint,
then that end of the path will never grow. If the same thing happens to the other endpoint of the
path, then the path will never get longer, and will never be connected to any of the other paths.
Similarly, the number of components in the final 2-matching is increased whenever the algorithm
deletes a $Y_1$ or a $Y_2$. Thus we can bound the number of components in the final 2-matching by
bounding the number of steps i such that $\delta(i) = \zeta$.

Roughly, $P[\delta(i) = \zeta] = \frac{\zeta}{2M} = \Theta\left(\frac{1}{n} \min\left\{(1-t)^{-3/2}, \frac{n^{1/5}}{1-t}\right\}\right)$. So integrating, we estimate
the total number of components as

$\Theta\left(\int_0^{1-1/n} \min\left\{(1-t)^{-3/2}, \frac{n^{1/5}}{1-t}\right\} dt\right) = \Theta\left(n^{1/5} \log n\right).$

5 The stopping time T and dynamic concentration 

In this section, we introduce a stopping time T, before which A and B stay close to their trajectories,
and $\zeta$ does roughly what we expect it to do. We will also introduce "error" terms for both A, B
and a "correction" term $\alpha$ for the variable A. For most of the process, $\alpha$ will stay smaller than
the error term for A. However, toward the end of the process $\alpha$ will be significant. Using $\alpha$ in our
calculations thus allows us to track the process farther. As it turns out, the variable B does not
need an analogous "correction" term.

We define the following random variables which represent "actual error" in A, B:

$e_a(i) := A(i) - na(t) - \alpha(i)$
$e_b(i) := B(i) - nb(t).$

We define the stopping time T as the minimum of $n - C_T n^{7/15} \log^{6/5} n$ and the first step i such that
any of the three following conditions fail:

$|e_a(i)| \le f_a(t),$ (5.1)
$|e_b(i)| \le f_b(t),$ (5.2)

and for every step j < i such that $\zeta$ is positive on steps j, ..., i,

$\zeta(i) \le \zeta(j) - \sum_{j \le k < i} \Phi\left(\frac{k}{n}\right) + \ell_j(t)$ (5.3)

for some as-yet unspecified error functions $f_a, f_b, \ell_j$ and absolute constant $C_T$. Throughout the
paper we will use C with a subscript to refer to unspecified but existent absolute constants. In subsection 5.6, we
present actual values for these constants.

We anticipate that the conditions on $\zeta$ will imply that for some function $f_\zeta$ we have

$\zeta(i) \le f_\zeta(t)$

for all $i \le T$. Our goal for now is to prove that for some suitable error functions, w.h.p. T is not
triggered by any of the conditions (5.1), (5.2), (5.3).

Theorem 5.1. With high probability,

$T = n - C_T n^{7/15} \log^{6/5} n.$

The remainder of this section contains the proof of Theorem 5.1. Here we define the error
functions $f_a, f_b, f_\zeta$ (up to the choice of constants). While these definitions are not very enlightening
at this point, they will aid the reader in confirming many of the calculations that appear below.
Those same calculations will motivate the choice of these functions.

$f_a(t) := C_A (1-t)^{3/4} n^{1/2} \log^{1/2} n$ (5.4)

$f_b(t) := C_B (1-t)^{-1/2} \log n$ if $1 - t > n^{-2/5} \log^{2/5} n$, and $f_b(t) := -C_B n^{1/5} \log^{4/5} n \log(1-t)$ otherwise (5.5)

$f_\zeta(t) := C_\zeta \min\left\{ (1-t)^{-1/2} \log n, \; n^{1/5} \log^{4/5} n \right\}.$ (5.6)



5.1 A useful lemma 

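We'll use the following simple lemma several times to estimate fractions.

Lemma 5.2. If $|\epsilon_y / y| \le \frac{1}{2}$, then

$\frac{x + \epsilon_x}{y + \epsilon_y} = \frac{x}{y} + \frac{y\epsilon_x - x\epsilon_y}{y^2} + O\left(\frac{y\epsilon_x\epsilon_y + x\epsilon_y^2}{y^3}\right).$

Proof.

$\frac{x + \epsilon_x}{y + \epsilon_y} - \frac{x}{y} = \frac{y\epsilon_x - x\epsilon_y}{y^2} \cdot \frac{1}{1 + \frac{\epsilon_y}{y}} = \frac{y\epsilon_x - x\epsilon_y}{y^2} \left(1 + O\left(\frac{\epsilon_y}{y}\right)\right) = \frac{y\epsilon_x - x\epsilon_y}{y^2} + O\left(\frac{y\epsilon_x\epsilon_y + x\epsilon_y^2}{y^3}\right).$

□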
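
A quick numerical illustration of this kind of fraction estimate: the sketch below compares the exact quotient $(x + \epsilon_x)/(y + \epsilon_y)$ with the first-order approximation $x/y + (y\epsilon_x - x\epsilon_y)/y^2$, and compares the discrepancy with $(|y\epsilon_x\epsilon_y| + |x|\epsilon_y^2)/|y|^3$. The function names and the sample values are assumptions made for the illustration; the factor 2 used in the check is what one gets when $|\epsilon_y/y| \le 1/2$.

```python
def first_order_quotient(x, y, ex, ey):
    """First-order approximation x/y + (y*ex - x*ey)/y^2 of (x + ex)/(y + ey)."""
    return x / y + (y * ex - x * ey) / (y * y)

def second_order_scale(x, y, ex, ey):
    """Size (|y ex ey| + |x| ey^2) / |y|^3 of the big-O error term."""
    return (abs(y * ex * ey) + abs(x) * ey * ey) / abs(y) ** 3

x, y, ex, ey = 3.0, 10.0, 0.01, 0.02
exact = (x + ex) / (y + ey)
approx = first_order_quotient(x, y, ex, ey)
print(exact - approx, second_order_scale(x, y, ex, ey))
```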
5.2 T is not triggered by A



We define

$A^+(i) := A(i) - na(t) - \alpha(i) - f_a(t) = e_a(i) - f_a(t)$

and let the stopping time $T_j$ be the minimum of max(j, T) and the least index $i \ge j$ such that $e_a(i)$ is
not in the critical interval

$[g_a(t), f_a(t)]$ (5.7)

where $0 < g_a < f_a$ is an as-yet unspecified function of n, t. Our strategy is to show that w.h.p. A
never goes above $na + \alpha + f_a$ because every time $e_a$ enters the critical interval, w.h.p. it does not
exit the interval at the top. The use of critical intervals in a similar context was first introduced in
[5].

Let $\mathcal{F}_i$ be the natural filtration of the process (so conditioning on $\mathcal{F}_i$ tells us the values of all
the variables, among other things).

For i < T, we have from (3.4) that

$E[\Delta A(i) \mid \mathcal{F}_i] = -1 - \frac{3Y}{2M} - \frac{\zeta}{2M} + \frac{2Z}{2M} - 2 \cdot \frac{2Z}{2M} \cdot \frac{\zeta}{2M} + O\left(\frac{1}{M}\right)$

$= -\frac{6A}{2B-A} + \frac{4\zeta(A+B)}{(2B-A)^2} + O\left(\frac{1}{2B-A} + \frac{\zeta^2}{(2B-A)^2}\right)$

$= -\frac{6(na + \alpha + e_a)}{2(nb + e_b) - (na + \alpha + e_a)} + \frac{4\zeta[(na + \alpha + e_a) + (nb + e_b)]}{[2(nb + e_b) - (na + \alpha + e_a)]^2} + O\left(\frac{1}{2B-A} + \frac{\zeta^2}{(2B-A)^2}\right)$

$= -\frac{6a}{2b-a} + \frac{12ae_b - 12b(\alpha + e_a)}{n(2b-a)^2} + \frac{4(a+b)\zeta}{n(2b-a)^2} + O\left(\frac{1}{n(2b-a)} + \frac{\alpha^2 + f_a^2 + f_b^2 + f_\zeta^2}{n^2(2b-a)^2}\right).$

The last equality follows from Lemma 5.2. Note that the lemma actually implies that the big-O
term includes mixed products of terms like $\alpha \cdot f_a$ for example. We have simplified by using the fact
that for all real numbers x and y, $|xy| \le \frac{1}{2}(x^2 + y^2)$. We are now motivated to cancel out the $\zeta$
term in the last line by recursively defining

$\alpha(0) := 0$ (5.8)

$\alpha(i+1) := \alpha(i) + \frac{4(a+b)\zeta - 12b\alpha(i)}{n(2b-a)^2}.$ (5.9)
From this definition and the definition of $f_\zeta$, it follows that for $i \le T$,

$0 \le \alpha(i) \le \sum_{k<i} \frac{4(a+b)f_\zeta}{n(2b-a)^2} \le C_\alpha \log n \, (1-t)^{-1/2}$ for $i \le n - n^{3/5}\log^{2/5} n$, and $\alpha(i) \le C_\alpha n^{1/5}\log^{9/5} n$ for $n - n^{3/5}\log^{2/5} n < i \le T$, (5.10)

as long as

$C_\alpha \ge 8C_\zeta.$ (5.11)

Now for $j \le i < T_j$, we have the supermartingale condition

$E[\Delta A^+(i) \mid \mathcal{F}_i] = E[\Delta A(i) \mid \mathcal{F}_i] - a'(t) - (\alpha(i+1) - \alpha(i)) - \frac{1}{n}f_a'(t) + O\left(\frac{1}{n}a''(t) + \frac{1}{n^2}f_a''(t)\right)$ (5.12)

$\le -\frac{12b g_a}{n(2b-a)^2} - \frac{1}{n}f_a'(t) + O\left(\frac{af_b}{n(2b-a)^2} + \frac{1}{n(2b-a)} + \frac{\alpha^2 + f_a^2 + f_b^2 + f_\zeta^2}{n^2(2b-a)^2} + \frac{1}{n}a''(t) + \frac{1}{n^2}f_a''(t)\right).$ (5.13)

Note that in the last line we have used (5.9), the fact that $e_a \ge g_a$, and also that a satisfies the
differential equation (4.2). By taking $g_a = \Theta(f_a)$, we see that $A^+(j), \ldots, A^+(T_j)$ is a supermartingale
since

$\frac{12b g_a}{n(2b-a)^2} + \frac{1}{n}f_a'(t) = \Omega\left(n^{-1/2} \log^{1/2} n \, (1-t)^{-1/4}\right)$

which dominates the big-O term in (5.13).

We use the following asymmetric version of the Azuma-Hoeffding inequality (for a proof see
[2]):

Lemma 5.3. Let $X_j$ be a supermartingale, such that $-C \le \Delta X(j) \le c$ for all j, for $c < C/10$. Then
for any $a < cm$ we have $Pr(X_m - X_0 > a) \le \exp\left(-\frac{a^2}{3cCm}\right)$.

We have

$-2 \le \Delta A \le 0$

and

$-2(1-t)^{1/2} \le a'(t) \le 0.$

This follows from analysis of the function a(t). So

$-2 \le \Delta A^+ \le 2\left(1 - \frac{i}{n}\right)^{1/2}$

for the supermartingale $A^+(j), \ldots, A^+(T_j)$. Thus, if A crosses its upper boundary at the stopping
time T, then there is some step j (with $T = T_j$) such that



and $A^+(T_j) > 0$. In this case, j is intended to represent the step when $e_a$ enters the critical interval
(5.7). Applying the lemma, we see that the probability that the supermartingale $A^+$ has such a
large upward deviation is at most

$\exp\left\{ -\frac{\left(f_a\left(\frac{j}{n}\right) - g_a\left(\frac{j}{n}\right)\right)^2}{12n\left(1 - \frac{j}{n}\right)^{3/2}} \right\}.$

As there are O(n) supermartingales $A^+(j), \ldots, A^+(T_j)$, we must choose $f_a, g_a$ to make the above
probability $o\left(\frac{1}{n}\right)$. The following choice suffices:



$f_a(t) = C_A (1-t)^{3/4} n^{1/2} \log^{1/2} n$



9a(t) = ~Ja{t). 



as long as the constant Ca is chosen so that 



If we define

$A^-(i) := A(i) - na(t) - \alpha(i) + f_a(t) = e_a(i) + f_a(t)$

then we may prove that $A^-$ stays positive w.h.p. in a completely analogous fashion.



(5.14) 



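5.3 T is not triggered by $\zeta$

Referring to (4.8), we may say that if $\zeta(i) > 0$,

$E[\Delta\zeta(i) \mid \mathcal{F}_i] = p_z p_y - p_y + O(p_\zeta) = -\frac{9A^2}{(2B-A)^2} + O\left(\frac{\zeta}{2B-A}\right).$ (5.15)

Now, before T we have

$\frac{9a^2}{(2b-a)^2} - \frac{9A^2}{(2B-A)^2} = 9\left(\frac{a}{2b-a} - \frac{A}{2B-A}\right)\left(\frac{A}{2B-A} + \frac{a}{2b-a}\right)$

$= 9\left(-\frac{2b(\alpha + e_a) - 2ae_b}{n(2b-a)^2} + O\left(\frac{\alpha^2 + f_a^2 + f_b^2}{n^2(2b-a)^2}\right)\right)\left(\frac{2a}{2b-a} + \frac{2b(\alpha + e_a) - 2ae_b}{n(2b-a)^2} + O\left(\frac{\alpha^2 + f_a^2 + f_b^2}{n^2(2b-a)^2}\right)\right)$

$= \frac{36a(ae_b - b\alpha - be_a)}{n(2b-a)^3} + O\left(\frac{\alpha^2 + f_a^2 + f_b^2}{n^2(2b-a)^2}\right).$ (5.16)

In the last step we have cleaned up the big-O using the facts

$\frac{\alpha + f_a + f_b}{n(2b-a)} = o(1)$ and $\frac{a}{2b-a} = O(1).$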

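For every step j, we define a stopping time

$T_j := \min\{i(j), \max(j, T)\}$

where i(j) is the least index i > j such that $\zeta(i) = 0$. Also, define a sequence $\zeta_j^+(j), \ldots, \zeta_j^+(T_j)$
where

$\zeta_j^+(i) := \zeta(i) + \sum_{j \le k < i} \Phi\left(\frac{k}{n}\right) - h_j(t)$

where $h_j$ is some function we will choose that will make $\zeta_j^+(i)$ a supermartingale. Now for $j \le i < T_j$,
using (5.16), we have

$E[\Delta\zeta_j^+(i) \mid \mathcal{F}_i] = -\frac{9A^2}{(2B-A)^2} + \frac{9a^2}{(2b-a)^2} - \frac{1}{n}h_j'(t) + O\left(\frac{\zeta}{2B-A} + \frac{1}{n^2}h_j''(t)\right)$ (5.17)

$\le \frac{36a(af_b + bf_a)}{n(2b-a)^3} - \frac{1}{n}h_j'(t) + O\left(\frac{\alpha^2 + f_a^2 + f_b^2}{n^2(2b-a)^2} + \frac{f_\zeta}{n(2b-a)} + \frac{1}{n^2}h_j''(t)\right).$ (5.18)

Note that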



36a(a/ 6 + 6/ B ) < ^ + , ^ ^ „ (1 _ t)i (g 1Q) 



(26 -a) 3 - V8 
so the choice 

makes the sequence a supermartingale as long as the constant Ch is chosen so that 

C h > ^C A . (5.20) 

Since $h_j(j/n) = 0$, we will always have $\zeta_j^+(j) = \zeta(j)$.

We'll use the following supermartingale inequality due to Freedman [8]: 

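Lemma 5.4. Let $X_i$ be a supermartingale, with $\Delta X_i \le C$ for all i, and $V(i) := \sum_{k \le i} Var[\Delta X(k) \mid \mathcal{F}_k]$.
Then

$P[\exists i : V(i) \le v, X_i - X_0 \ge d] \le \exp\left(-\frac{d^2}{2(v + Cd)}\right).$

Referring to (4.8), before T we can put

$Var[\Delta\zeta_j^+(i) \mid \mathcal{F}_i] = Var[\Delta\zeta(i) \mid \mathcal{F}_i] \le E\left[(\Delta\zeta(i))^2 \mid \mathcal{F}_i\right] = 1 \cdot p_z p_y + 1 \cdot p_y + 4 \cdot (p_\zeta + p_z p_\zeta) + O\left(\frac{1}{M}\right)$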

and note that before T, we have

$p_y = \frac{3Y}{2M} \le \frac{3A}{2B-A} \le \frac{3[n(1-t)^{3/2} + \alpha + f_a]}{4n(1-t) - 2f_b - n(1-t)^{3/2} - \alpha - f_a} \le \left(1 + \frac{C_\alpha}{C_T^{3/2}} + o(1)\right)(1-t)^{1/2}$ (5.21)

so we will just say $p_y \le C_{p_y}(1-t)^{1/2}$ for some constant $C_{p_y}$ such that

$C_{p_y} > 1 + \frac{C_\alpha}{C_T^{3/2}}.$ (5.22)

Also, note that $\Delta\zeta_j^+ \le 2$.

Suppose the variable $\zeta$ triggers the stopping time T. Then there are steps j < i = T such that
$\zeta > 0$ all the way from step j to step i, and $\zeta_j^+(i) - \zeta_j^+(j) > \ell_j(t) - h_j(t)$. We'll need to apply the lemma
to the supermartingale $\zeta_j^+$ to show this event has low probability (guiding our choice for $\ell_j$). Note
that in the lemma we can plug in the following for v.

$V(i) = \sum_{j \le k < i} Var[\Delta\zeta_j^+(k) \mid \mathcal{F}_k] \le 3C_{p_y}\left(1 - \frac{j}{n}\right)^{1/2}(i - j).$

So the unlikely event has probability at most

$\exp\left\{ -\frac{(\ell_j - h_j)^2}{2\left[3C_{p_y}\left(1 - \frac{j}{n}\right)^{1/2}(i-j) + 2(\ell_j - h_j)\right]} \right\}.$

As there are $O(n^2)$ pairs of steps j, i we'd like to make the above probability $o\left(\frac{1}{n^2}\right)$. Towards
this end we consider 2 cases.

If $\left(1 - \frac{j}{n}\right)^{1/4}(i-j)^{1/2} \le \log^{1/2} n$, then it suffices to put $\ell_j - h_j = C_\ell \log n$ as long as

$\frac{C_\ell^2}{6C_{p_y} + 4C_\ell} > 2.$ (5.23)

If $\left(1 - \frac{j}{n}\right)^{1/4}(i-j)^{1/2} > \log^{1/2} n$, then it suffices to put $\ell_j - h_j = C_\ell\left(1 - \frac{j}{n}\right)^{1/4}(i-j)^{1/2}\log^{1/2} n$.

Thus we choose

$\ell_j(t) := h_j(t) + C_\ell \max\left\{ \log n, \left(1 - \frac{j}{n}\right)^{1/4}(i-j)^{1/2}\log^{1/2} n \right\}.$

With this choice, w.h.p. T is not triggered by $\zeta$.
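
To see how Freedman's inequality (Lemma 5.4) produces a polynomially small failure probability, the sketch below evaluates its right-hand side $\exp(-d^2/(2(v + Cd)))$ in the first case of the argument. The inputs are assumptions made for the illustration: a deviation $d = C_\ell \log n$, a variance budget $v = 3C_{p_y}\log n$, a step bound C = 2, and the values 12 and 2 listed in subsection 5.6 (read here as $C_\ell$ and $C_{p_y}$). The function name `freedman_bound` is hypothetical.

```python
import math

def freedman_bound(d, v, C):
    """Right-hand side exp(-d^2 / (2 (v + C d))) of Freedman's inequality."""
    return math.exp(-d * d / (2 * (v + C * d)))

n = 10 ** 6
log_n = math.log(n)
C_ell, C_py = 12, 2                      # assumed constants, as quoted in subsection 5.6
bound = freedman_bound(d=C_ell * log_n, v=3 * C_py * log_n, C=2)
exponent = C_ell ** 2 / (6 * C_py + 4 * C_ell)
print(bound, n ** -exponent)             # the two numbers agree: the bound is n^(-exponent)
```

With these inputs the exponent is 144/60 = 2.4, so the bound decays faster than $1/n^2$, which is what a union bound over $O(n^2)$ pairs of steps requires.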



5.4 An upper bound on $\zeta$

In this section we'll motivate our choice of the function $f_\zeta$.

— 3 2 

Lemma 5.5. W.h.p. for all j < n — 2C£n$ logs n such that — 1) = 0, we have 
1- C(j') = for some j < j' < j + C X (l— *Q 2 logn, and 

2. C(i) < 2C| (l - |) ~ 5 log n /or a// j < i < f - 1 
Proof. Suppose $\zeta(j - 1) = 0$. Note that we then have $\zeta(j) \le 2$. $\Phi(t)/(1 - t)$ is decreasing since



d {<S>(t)\ ( 3a 
' w » ^ 2 



(it V 1 - t 




(26 - a) • 3 (- 



6a 
2b~a 



3a -4 + 



6a 
26-a 



(26-a) 2 (l-i) 



+ (!-*)' 



3a 



26 -a 



(1 -t) 2 (26-a) z 



< 0. 
Also, using (4.3) and the definition of b,

$\lim_{t \to 1^-} \frac{\Phi(t)}{1-t} = \frac{1}{6}.$

Hence $\Phi(t) \ge \frac{1}{6}(1-t)$ for all $0 \le t < 1$. If we substitute $x = \frac{i-j}{n}$ then



j<k<i v 7 



2ra 



1 



1 



(2 — j) > H — n I 1 ) £ 



12 



6 



11 



Plugging in the value of $\ell_j(t)$, we have that for any i > j such that $\zeta(j), \ldots, \zeta(i)$ are all positive,



j<k<i 



< TO 

~ 12 



1 (1 J 
-n 1 

6 \ ra 



ChTi2 log 2 n I 1 — — 



+ Ci max ^ log n, ( 1 — — j na log 2 nx2 > + 2. 



(5.24) 
(5.25) 



Consider (5.25) for x = Xj := C x n Mogra ^1 — . As long as C x > 1 andj < n — 2C x n^> logs n, 
we have 



j \ 4 1 1 1 

1 n 2 log 2 rax? > logn 

ra / J 



so we can evaluate the "max" in £j. Also note that the coefficient of x is dominated by — gn ^1 — ^ 
so the coefficient of x is at most, say — (l — *Q. Thus (5.25) gives 

-3 



CO' + naij-) < ^n- 1 log 2 n ( 1 - f- 



.7 



n 



^-Qv^jlogra(l-£) 2 +2 



which is negative for this range of j as long as we pick C x such that 

C x 



CpJCr. > 0. 



(5.26) 



Therefore, $\zeta$ must have hit 0 again before step $i = j + nx_j$. This proves the first part of the lemma.

To prove the second part, consider (5.25) for $j \le i \le j + nx_j$ (i.e. for $0 \le x \le x_j$). If
$x < n^{-1} \log n \left(1 - \frac{j}{n}\right)^{-1/2}$ then we can put



12 

and for x larger than that, we'll put 



C — ni 2 — — n ( 1 — — ) x + Ci log ra < S 



COO < Y^nx 
c 



1 



-ra 1 



x + C(\ 1 



712 log 2 nxz 



< ^L n -i W 2 n 1 - J -\ + 



12 



< 2Cj ( 1 - J - 

n 



v 



7C 2 



ra 



logra 



log n. 
where to justify the second line we use the inequality $c\sqrt{x} - dx \le \frac{c^2}{4d}$ for real numbers x, c, d > 0. □

-3 2 

We would also like to say something about for i > n — 2C£n$ logs n. 

1 4 

Lemma 5.6. There exists a constant such that w.h.p. for alii <T we have < C^ns logs n. 

— 3 2 

Proof. Suppose step f > n — 2Cxnslogsn with C(j') = 0. It follows from Lemma 5.5 that 
w.h.p. such a f exists. Let i > j' such that C(j') • • • C(0 are an positive. Note that we again 

■/ — 2 2 

have the bound (5.25). But now < x < < 2C;! n _ s logs n, and (5.25) gives (,{%) < 

±cj + 2*Cf,cf\ ns logs n. □ 

So in particular we can say that for $i \le T$ we have

$\zeta(i) \le f_\zeta(t) = C_\zeta \min\left\{ (1-t)^{-1/2} \log n, \; n^{1/5} \log^{4/5} n \right\},$

where 

C c > max |2Cf, |c| + 2*C e C§ 5 | (5.27) 

5.5 T is not triggered by B 

Recall from (4.1) that

$$e_b(i) = \sum_{j<i}\left(1_{\sigma(j)=\mathrm{loop}} - 1_{\delta(j)=\zeta}\right).$$



First we'll bound 1<S(3')=C' Define B (i) := — l<5(j)=C + ~^fb{t)- Then 

97 a 1 / 1 

£?[Afl-(t)|Ji] = -— + —fUt) + O -xfL'Ct) 

1 Wl J 2M 2M-2 2n JiW Vn 2 ^ 17 



> ft 



V^ (t)+ °G^' (t) )' 



n(26 

Note that by (4.7), 

3(1 -t) < 2b- a < 4(1 -t) 

so we can put 



$$f_b(t) := C_B\cdot\begin{cases}(1-t)^{-1/2}\log n & : 1-t > n^{-2/5}\log^{2/5} n\\ -n^{1/5}\log^{4/5} n\,\log(1-t) & : \text{otherwise}\end{cases}\qquad(5.28)$$



and -B will be a submartingale as long as 



C B > \c c . (5.29) 
We'll apply Lemma 5.4 to $-B^-$. Note that before $T$ we can put
Var[AB-(i)\Ti] = Var[l s ®=c\?i\ 

<pc< k < k 

— 4n(l — t) — na — a — f a — ft, ~ 3n(l — t) 
and therefore, referring to V(i) as in Lemma 5.4, 

k 



So for u we will plug in 



3n(l - t)" 



$$C_{VB}\cdot\begin{cases}(1-t)^{-1/2}\log n & : 1-t > n^{-2/5}\log^{2/5} n\\ n^{1/5}\log^{9/5} n & : \text{otherwise,}\end{cases}\qquad(5.30)$$



which is an upper bound on V(i) as long as 



C VB > -C c (5.31) 



Note \AB | < 1, so the probability that — B (i) > is at most 



exp 



2 

_ f 

4-lb 



1 f 2 



which is o (^) as long as 



So w.h.p. for all i < T, we have 



2 [v + \h 



in 2 

3 B > 1. (5.32) 



^l^) =c </ 6 (i). 



The sum $\sum_{j<i} 1_{\sigma(j)=\mathrm{loop}}$ presents less difficulty, since w.h.p. the configuration has at most $C\log n$
loops total (for a suitable constant $C$). So we can trivially say that

$$\sum_{j<i} 1_{\sigma(j)=\mathrm{loop}} < f_b(t)$$

and hence w.h.p. the stopping time $T$ is not triggered by variable $B$.



5.6 Values for the constants 

Throughout the proof above, we collect various constraints on the constants in (5.11), (5.14), (5.20), 
(5.22), (5.23), (5.26), (5.27), (5.29), (5.31) and (5.32). The reader may check that the following
values satisfy all the conditions. 

C A = 16, C h = 20, C Py = 2, C e = 12, C x = 8000, 

C c = 800, C a = 70000, C VB = 700, C B = 1200, C T = 2000. 
This completes the proof of Theorem 5.1. 
6 Upper bound on the number of components



In this section we prove the following lemma which provides the upper bound for the proof of 
Theorem 1.1: 

Lemma 6.1. W.h.p. the algorithm outputs a 2-matching with $O(n^{1/5}\log^{9/5} n)$ components.

Proof. The components of our 2-matching at any step $i$ consist of cycles and paths (including paths
of length 0). First we'll bound the number of paths in the final 2-matching. Note that these final
paths have both endpoints in $Z_0$, meaning that each endpoint had a half-edge in $\zeta$ that got deleted
(or for paths of length 0 there is only one vertex, which is in $Y_0$). So to bound the number of these
paths, we bound the sum $\sum_j 1_{\delta(j)=\zeta}$. Note that in light of Section 5.5, we have the bound

$$\sum_{j \le T} 1_{\delta(j)=\zeta} = O(n^{1/5}\log^{9/5} n).$$

Next we'll bound the terms corresponding to steps after T, but before A = 0. By Theorem 5.1 
we have w.h.p. 

A(T) = O (n* log§ n 

since 

< a(T) = O ink logt n 



by (5.10), and 



ii a 



^)' /a (^) =0 ( nV5l ° g9/5n ) 



Now note that by (3.4), on each step $j$ such that $\sigma(j) \in \{Z, \mathrm{multi}\}$ and $\delta(j) = \zeta$, the variable $A$
decreases by 2. Also, the variable $A$ is nonincreasing. Therefore there can be at most $O(n^{1/5}\log^{9/5} n)$
such steps $j$ until $A = 0$.

Once we have $A = 0$, the algorithm finds a maximum matching on the remaining random 2-regular
graph $\Gamma$. Thus, to complete the bound on the number of paths in the final 2-matching,
we'll bound the number of vertices in $\Gamma$ that are unsaturated by the matching (i.e. the number
of odd cycles in the remaining 2-regular graph $\Gamma$). But $\Gamma$ has at most $O(\log n)$ cycles total, since
it's a random 2-regular graph. Thus, the sum $\sum_j 1_{\delta(j)=\zeta}$, and therefore the number of paths in the
final 2-matching, are $O(n^{1/5}\log^{9/5} n)$.

Now we bound the number of cycles in the final 2-matching. Note that at any step, the 
probability of closing a cycle is at most 2 ai-i ■ Therefore, the number of cycles created for the 
whole process is stochastically dominated by the random variable 

3n 

i=i 

where 



with prob. i 

J (6.1) 
with prob. —j- . 
So if we define the martingale

then we have Var[AC(i)] = ^r, and note Y2i=i ^r" = 0(\og(3n)). Now, applying Lemma 5.4 to 
C{i) shows that w.h.p. it is always at most 0(log2 n), and since E[C] = O(logn), we have that 
C = O(logn) w.h.p.. 

□ 
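The logarithmic bound on the number of cycles rests on a harmonic sum. As a rough illustration only, the sketch below assumes the dominating variable is a sum of $3n$ independent indicators with success probability $1/i$ at step $i$ (our reading of (6.1)); the names `expected_cycles`, `variance_sum` and `sample_dominating` are hypothetical and not taken from the paper.

```python
import math
import random


def expected_cycles(n: int) -> float:
    """Mean of the dominating variable: sum_{i=1}^{3n} 1/i (assumed law)."""
    return sum(1.0 / i for i in range(1, 3 * n + 1))


def variance_sum(n: int) -> float:
    """Sum of the indicator variances (1/i)(1 - 1/i) for i = 1..3n."""
    return sum((1.0 / i) * (1.0 - 1.0 / i) for i in range(1, 3 * n + 1))


def sample_dominating(n: int, rng: random.Random) -> int:
    """Draw one sample of the assumed dominating sum of indicators."""
    return sum(1 for i in range(1, 3 * n + 1) if rng.random() < 1.0 / i)
```

Since the harmonic number $H_m$ lies between $\ln m$ and $\ln m + 1$, both the mean and the total variance computed this way are $O(\log(3n))$.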



7 Lower bound on the number of components 



In this section we will prove that near the end of the process, there is a non-zero probability that
$\zeta$ becomes large and stays large for a significant amount of time. In this case, the algorithm will
likely delete an edge adjacent to a $\zeta$ vertex. In particular, we will prove the following lemma which
provides the lower bound and thus completes the proof of Theorem 1.1:

Lemma 7.1. W.h.p. the algorithm outputs a 2-matching with $\Omega(n^{1/5}\log^{-4} n)$ components.

Proof. We show that $\zeta$ stochastically dominates a suitably defined martingale and then apply the
following central limit theorem of Freedman.

Lemma 7.2. Let $S_i$ be a martingale adapted to the filtration $\mathcal{F}_i$ with $X_i := S_i - S_{i-1}$, $|X_i| \le C$
for some constant $C$, and let $V_i := \sum_{k \le i} Var[X_k \mid \mathcal{F}_{k-1}]$. For each $n$, let $0 < \gamma_n < \gamma'_n$ be real
numbers, and let $\sigma_n$ be a stopping time. As $n \to \infty$, suppose $\gamma_n \to \infty$ and $\gamma'_n/\gamma_n \to 1$ and
$P[\gamma_n < V_{\sigma_n} < \gamma'_n] \to 1$. Then $S_{\sigma_n}/\sqrt{\gamma_n}$ converges in distribution to $N(0,1)$.



Let

$$w(i) = \frac{3a(i/n)}{2b(i/n) - a(i/n)}.$$

In this section we will consider steps from $i_0 = n - n^{3/5}$ to

$$i_{end} = n - n^{3/5} + n^{3/5}\log^{-1} n < n - \tfrac{1}{2}n^{3/5}.$$

From Theorem 5.1, w.h.p., $T$ occurs after this time frame. Hence we have dynamic concentration
on our variables and can say in this range,

$$p_y(i) = w(i) - O(n^{-2/5}\log n) \qquad (7.1)$$
$$p_\zeta(i) = O(n^{-2/5}\log n) \qquad (7.2)$$
$$p_z(i) = 1 - p_y(i) - p_\zeta(i). \qquad (7.3)$$



Note that in this range we also have $w(i) = O(n^{-1/5})$. Our martingale will have independent
increments given by

$$X(i) = \begin{cases} 1 & \text{with prob. } w(i) - Ln^{-2/5}\log n\\ 0 & \text{with prob. } 1 - 2w(i) - Ln^{-2/5}\log n\\ -1 & \text{with prob. } w(i) - Ln^{-2/5}\log n\\ -2 & \text{with prob. } 3Ln^{-2/5}\log n\end{cases}\qquad(7.4)$$
where $L$ is a positive constant large enough that for all $i_0 \le i \le i_{end}$

$$p_z(i)p_y(i) \ge w(i) - Ln^{-2/5}\log n, \qquad p_z(i)p_y(i) + p_z(i)^2 \ge 1 - w(i) - 2Ln^{-2/5}\log n$$

and

$$p_z(i)p_y(i) + p_z(i)^2 + p_y(i) \ge 1 - 3Ln^{-2/5}\log n.$$

In this case, $\Delta\zeta(i)$ stochastically dominates $X(i)$. This follows from (4.8) in the case when $\zeta > 0$
and trivially when $\zeta = 0$.

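For any $i_0 \le i \le i_{end}$ we have

$$E[X(i) \mid \mathcal{F}_{i-1}] = -6Ln^{-2/5}\log n$$

and, computing the second moment $2w(i) + 10Ln^{-2/5}\log n$ from (7.4) and subtracting the squared mean,

$$Var[X(i) \mid \mathcal{F}_{i-1}] = 2w(i) + 10Ln^{-2/5}\log n - 36L^2 n^{-4/5}\log^2 n.$$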
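The mean and variance of the increment can be recomputed directly from the four-point law (7.4). The sketch below is illustrative only: `increment_law`, `mean` and `variance` are hypothetical helper names, and the argument `eps` stands for the quantity $Ln^{-2/5}\log n$, passed in as a plain number.

```python
def increment_law(w: float, eps: float) -> dict:
    """Return the law of X(i) from (7.4) as {value: probability}.

    `w` plays the role of w(i) and `eps` the role of L * n^(-2/5) * log n.
    """
    law = {1: w - eps, 0: 1 - 2 * w - eps, -1: w - eps, -2: 3 * eps}
    if min(law.values()) < 0:
        raise ValueError("parameters do not define a probability law")
    return law


def mean(law: dict) -> float:
    """Expected value of a finite law given as {value: probability}."""
    return sum(x * p for x, p in law.items())


def variance(law: dict) -> float:
    """Variance of a finite law given as {value: probability}."""
    m = mean(law)
    return sum((x - m) ** 2 * p for x, p in law.items())
```

For any admissible inputs the probabilities sum to 1, the mean equals `-6 * eps`, and the variance equals `2 * w + 10 * eps - 36 * eps ** 2`, so the variance is $2w(i)$ up to lower-order terms.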



We will split the time range $i_0$ to $i_{end}$ into $d = \log n$ many chunks of length $n^{3/5}\log^{-2} n$. Recall
that $i_0 = n - n^{3/5}$ and for all $1 \le \ell \le d$ define

$$i_\ell = i_{\ell-1} + n^{3/5}\log^{-2} n.$$
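To make the time window concrete, the chunk endpoints can be tabulated for a specific $n$. This is a sketch under stated assumptions: the text treats $d = \log n$ and the chunk length $n^{3/5}\log^{-2} n$ as real quantities, whereas the code rounds $d$ down to an integer and keeps the endpoints as floats; `chunk_endpoints` is a hypothetical name.

```python
import math


def chunk_endpoints(n: int) -> list:
    """Return [i_0, i_1, ..., i_d] with i_0 = n - n^(3/5).

    Each chunk has length n^(3/5) / log(n)^2, and d = floor(log n)
    (the rounding is an assumption made for this illustration).
    """
    base = n ** (3 / 5)
    d = math.floor(math.log(n))
    length = base / math.log(n) ** 2
    i0 = n - base
    return [i0 + ell * length for ell in range(d + 1)]
```

Because $d$ chunks of this length span at most $n^{3/5}\log^{-1} n$ steps, the last endpoint stays below $n - \frac{1}{2}n^{3/5}$ as soon as $\log n > 2$.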



For $0 \le \ell < d$, we define a martingale starting at $i_\ell$ to be

$$S_\ell(k) := \sum_{i=i_\ell+1}^{k}\left(X(i) - E[X(i) \mid \mathcal{F}_{i-1}]\right).$$



Then for $0 \le \ell < d$ we have

$$S_\ell(i_{\ell+1}) = \left[\sum_{i=i_\ell+1}^{i_{\ell+1}} X(i)\right] + 6Ln^{1/5}\log^{-1} n.$$



We also have that

$$V_\ell := \sum_{i=i_\ell+1}^{i_{\ell+1}} Var[X(i) \mid \mathcal{F}_{i-1}] = \Theta(n^{2/5}\log^{-2} n).$$

Further, using the fact that the expression for $V_\ell$ is completely deterministic, we may choose
$\gamma(n,\ell)$ such that $\gamma(n,\ell) \le V_\ell \le \gamma(n,\ell) + o(\gamma(n,\ell))$. By using the facts about $a(t)$ and $b(t)$ presented
in Section 4, we may take $\gamma(n,\ell) = C_\ell n^{2/5}\log^{-2} n$ for some constant $C_\ell$. Note here that there is
an absolute constant $c$ such that $C_\ell < c$ for all $0 \le \ell < d$.

Hence applying Lemma 7.2 to $S_\ell$ with stopping time $i_{\ell+1}$, we see that

$$\frac{S_\ell(i_{\ell+1})}{\sqrt{C_\ell n^{2/5}\log^{-2} n}} \xrightarrow{d} N(0,1).$$

So there exists some constant $p_0 > 0$ such that for each $0 \le \ell < d$ (and $n$ sufficiently large)



+6Ln 1 / 5 log- 1 n > 6L ± x 
^/C e n 2 / 5 log" 2 n ~ Vc 



> Po 
So we get that

$$P\left[\forall\, 0 \le \ell < d,\ \zeta(i_{\ell+1}) < n^{1/5}\log^{-1} n\right] \le P\left[\forall\, 0 \le \ell < d,\ \sum_{i=i_\ell+1}^{i_{\ell+1}} \Delta\zeta(i) < n^{1/5}\log^{-1} n\right] \le (1-p_0)^{\log n} = o(1).$$

So we know that w.h.p. there is a point $i_b$ where $\zeta(i_b) \ge n^{1/5}\log^{-1} n$. We would like to show that
after $n^{3/5}\log^{-3} n$ steps, $\zeta$ has not decreased below $\frac{1}{2}n^{1/5}\log^{-1} n$. To prove this, we consider the
martingale

$$S_b(k) = n^{1/5}\log^{-1} n + \sum_{i=i_b+1}^{k}\left(X(i) - E[X(i) \mid \mathcal{F}_{i-1}]\right).$$

Let $i_c = i_b + n^{3/5}\log^{-3} n$. Then

$$\sum_{i=i_b+1}^{i_c} Var[X(i) \mid \mathcal{F}_{i-1}] = \Theta(n^{2/5}\log^{-3} n).$$

By applying Lemma 5.4 to this martingale, we have that after $n^{3/5}\log^{-3} n$ steps,



3i :i h <i<i c , C(0 < 2 ral/5l °g _1 



< 



3» <i c : S b (i) < V/Mog- 1 



7? 



< exp ^—Q 

< o(l). 



n 



2/5 lQ 



n 



n 2 / 5 log 3 n(l+n 1 / 5 log 2 n) 



So we know that w.h.p. $\zeta(i) \ge \frac{1}{2}n^{1/5}\log^{-1} n$ for $i_b \le i \le i_c$. In this time, the algorithm is likely to
delete an edge adjacent to a $\zeta$ vertex. Formally, we have that there exists some $q_0$ such that for all
$i_b \le i \le i_c$,

$$p_z(i)p_\zeta(i) \ge q_0 = \Omega(n^{-2/5}\log^{-1} n)$$

so that if $W$ is a random variable representing the number of $i$ between $i_b$ and $i_c$ when $\delta(i) = \zeta$,
then $W$ stochastically dominates $Bin(n^{3/5}\log^{-3} n, q_0)$.

$$E[Bin(n^{3/5}\log^{-3} n, q_0)] = \Omega(n^{1/5}\log^{-4} n)$$

so an application of the Chernoff bound tells us that, w.h.p., $W = \Omega(n^{1/5}\log^{-4} n)$.



□ 
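The last step of the proof combines the mean of the dominating binomial with a standard Chernoff lower-tail bound, $P[X \le (1-\delta)\mu] \le \exp(-\delta^2\mu/2)$ for $0 < \delta < 1$. The sketch below is only illustrative: `q_const` is a hypothetical stand-in for the unspecified constant hidden in $q_0 = \Omega(n^{-2/5}\log^{-1} n)$, and the function names are our own.

```python
import math


def dominating_binomial_mean(n: int, q_const: float = 1.0) -> float:
    """Mean of Bin(n^(3/5) / log(n)^3, q0) with q0 = q_const * n^(-2/5) / log(n).

    `q_const` is an assumed placeholder for the implicit constant in q0.
    """
    trials = n ** (3 / 5) / math.log(n) ** 3
    q0 = q_const * n ** (-2 / 5) / math.log(n)
    return trials * q0


def chernoff_lower_tail(mu: float, delta: float = 0.5) -> float:
    """Upper bound exp(-delta^2 * mu / 2) on P[X <= (1 - delta) * mu]."""
    return math.exp(-delta ** 2 * mu / 2)
```

The mean is `q_const * n ** (1 / 5) / math.log(n) ** 4`, matching the $n^{1/5}\log^{-4} n$ order in Lemma 7.1. Because of the $\log^{-4} n$ factor, this quantity only becomes large for very large $n$, so the statement should be read as asymptotic.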